@zbowling zbowling commented Jan 1, 2026

This PR contains three fixes for the MT7925 driver that resolve kernel panics and system deadlocks:

  1. NULL pointer dereference fix - Add NULL checks in vif iteration loops
  2. Reset/ROC mutex fix - Add mutex protection in reset work and suspend path
  3. Runtime PM/MLO PM mutex fix - Add mutex protection in PM functions

These bugs cause:

  • Kernel panics during WiFi reset/recovery
  • System-wide hangs during network switching or BSSID roaming
  • Deadlocks during suspend/resume

Tested on Framework Desktop (AMD Ryzen AI Max 300) with MT7925 WiFi.

The same bugs likely exist in the mt7921 driver (the predecessor of mt7925) and should be addressed in a follow-up patch series.

Add NULL checks for bss_conf in all loops that iterate over valid_links
and call mt792x_vif_to_bss_conf(). This prevents kernel panics when the
link configuration in mac80211 is not yet set up even though the driver's
valid_links bitmap has the link marked as valid.

This can happen during HW reset when link state is inconsistent, or during
MLO operations where the driver's link tracking is ahead of mac80211's
BSS configuration.

Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
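
To illustrate the guard described in this commit, here is a minimal userspace sketch, not the driver code. It assumes a simplified model in which a bitmap says a link is valid while the per-link configuration pointer may still be NULL. All names (`vif_model`, `bss_conf_model`, `vif_to_bss_conf_model`, `sum_beacon_int_model`, `MODEL_MAX_LINKS`) are hypothetical stand-ins, not kernel or mt76 APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_LINKS 15 /* hypothetical table size for this sketch */

/* Hypothetical stand-in for a per-link BSS configuration. */
struct bss_conf_model {
    int beacon_int;
};

/* Hypothetical stand-in for a vif: the bitmap and the pointer table
 * are updated independently, so they can disagree for a while. */
struct vif_model {
    uint16_t valid_links;                              /* driver-side view */
    struct bss_conf_model *link_conf[MODEL_MAX_LINKS]; /* may hold NULL */
};

/* Lookup helper: returns NULL when the link is not configured yet. */
static struct bss_conf_model *
vif_to_bss_conf_model(struct vif_model *vif, unsigned int link_id)
{
    if (link_id >= MODEL_MAX_LINKS)
        return NULL;
    return vif->link_conf[link_id];
}

/* Walk every link marked valid, skipping links whose configuration
 * is still missing instead of dereferencing a NULL pointer. */
static int sum_beacon_int_model(struct vif_model *vif)
{
    int total = 0;

    assert(vif != NULL);
    for (unsigned int i = 0; i < MODEL_MAX_LINKS; i++) {
        struct bss_conf_model *conf;

        if (!(vif->valid_links & (1u << i)))
            continue;

        conf = vif_to_bss_conf_model(vif, i);
        if (!conf)
            continue; /* the added NULL check */

        total += conf->beacon_int;
    }
    return total;
}
```

In this model, a vif with bits 0 and 1 set in `valid_links` but only link 0 configured is handled by skipping link 1 rather than crashing on it.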

During firmware recovery and ROC (Remain On Channel) abort operations,
the driver iterates over active interfaces and calls MCU functions that
require the device mutex to be held, but the mutex was not acquired.

This causes system-wide hangs: network commands hang indefinitely,
processes get stuck in uninterruptible sleep (D state), and the system
becomes completely unresponsive, requiring a forced reboot.

Add mutex protection around interface iteration in:
- mt7925_mac_reset_work(): Called during firmware recovery after MCU
  timeouts to reconnect all interfaces
- PCI suspend path: Wrap mt7925_roc_abort_sync() call with mutex

Note: The mutex is added at the call site in pci.c rather than inside
mt7925_roc_abort_sync() because this function is also called from the
station remove path which already holds the mutex.

Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
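
The note about taking the mutex at the call site can be sketched with a small single-threaded model. This is an illustration under stated assumptions, not the driver code: the boolean `mutex_held` stands in for a non-recursive device mutex, and every name (`dev_model`, `dev_lock_model`, `roc_abort_sync_model`, `suspend_path_model`, `sta_remove_path_model`) is hypothetical. The point it shows is that a helper reached from both a locked and an unlocked path cannot take the lock itself, so the unlocked caller has to.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; mutex_held stands in for a
 * non-recursive mutex, so locking it twice is a bug. */
struct dev_model {
    bool mutex_held;
    int roc_active;
    int mcu_cmds;
};

static void dev_lock_model(struct dev_model *dev)
{
    assert(!dev->mutex_held); /* a second lock would self-deadlock */
    dev->mutex_held = true;
}

static void dev_unlock_model(struct dev_model *dev)
{
    assert(dev->mutex_held);
    dev->mutex_held = false;
}

/* Helper shared by both paths: it requires the lock but never takes
 * it, because one of its callers already holds it. */
static void roc_abort_sync_model(struct dev_model *dev)
{
    assert(dev->mutex_held);
    if (dev->roc_active) {
        dev->roc_active = 0;
        dev->mcu_cmds++; /* stands in for an MCU command */
    }
}

/* Path that previously called the helper unlocked: the lock is now
 * taken here, at the call site. */
static void suspend_path_model(struct dev_model *dev)
{
    dev_lock_model(dev);
    roc_abort_sync_model(dev);
    dev_unlock_model(dev);
}

/* Path that already held the lock for its own work: unchanged. */
static void sta_remove_path_model(struct dev_model *dev)
{
    dev_lock_model(dev);
    dev->mcu_cmds++; /* other work done under the lock */
    roc_abort_sync_model(dev);
    dev_unlock_model(dev);
}
```

Had the lock been taken inside `roc_abort_sync_model()`, the call from `sta_remove_path_model()` would trip the double-lock assertion, which is the model's analogue of a self-deadlock.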
…O PM

Two additional code paths iterate over active interfaces and call MCU
functions without proper mutex protection:

1. mt7925_set_runtime_pm(): Called when runtime PM settings change.
   The ieee80211_iterate_active_interfaces() call invokes
   mt7925_pm_interface_iter() which calls mt7925_mcu_set_beacon_filter().

2. mt7925_mlo_pm_work(): Workqueue function for MLO power management.
   The iterator callback mt7925_mlo_pm_iter() calls mt7925_mcu_uni_bss_ps().

Add mutex protection around the iterate calls in both functions. For
mt7925_mlo_pm_iter(), move the mutex from inside the callback to the
caller (mt7925_mlo_pm_work) for consistency with other patterns.

This matches the pattern used in the older mt7615 driver and other
wireless drivers like iwlwifi.

Reported-by: Zac Bowling <zac@zacbowling.com>
Tested-by: Zac Bowling <zac@zacbowling.com>
Signed-off-by: Zac Bowling <zac@zacbowling.com>
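
The change of moving the mutex from the iterator callback to the caller can be sketched in the same simplified, single-threaded style. This is a hedged illustration, not the driver code: `iterate_active_model()` is a hypothetical stand-in for an interface iterator, and `pm_dev_model`, `iface_model`, `pm_iter_model`, and `pm_work_model` are invented names. It shows the lock being taken once around the whole iteration while the callback only relies on it being held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model with a counter to show how often the
 * lock is taken. */
struct pm_dev_model {
    bool mutex_held;
    int lock_count;
    int ps_cmds;
};

struct iface_model {
    bool active;
};

typedef void (*iface_iter_fn)(void *priv, struct iface_model *iface);

/* Hypothetical iterator: invokes fn for each active interface. */
static void iterate_active_model(struct iface_model *ifaces, size_t n,
                                 iface_iter_fn fn, void *priv)
{
    for (size_t i = 0; i < n; i++)
        if (ifaces[i].active)
            fn(priv, &ifaces[i]);
}

static void pm_lock_model(struct pm_dev_model *dev)
{
    assert(!dev->mutex_held);
    dev->mutex_held = true;
    dev->lock_count++;
}

static void pm_unlock_model(struct pm_dev_model *dev)
{
    assert(dev->mutex_held);
    dev->mutex_held = false;
}

/* Callback: no locking of its own, it only requires the lock. */
static void pm_iter_model(void *priv, struct iface_model *iface)
{
    struct pm_dev_model *dev = priv;

    (void)iface;
    assert(dev->mutex_held);
    dev->ps_cmds++; /* stands in for a power-save MCU command */
}

/* Work function: one lock/unlock pair around the whole iteration. */
static void pm_work_model(struct pm_dev_model *dev,
                          struct iface_model *ifaces, size_t n)
{
    pm_lock_model(dev);
    iterate_active_model(ifaces, n, pm_iter_model, dev);
    pm_unlock_model(dev);
}
```

With this shape the lock is acquired once per work invocation regardless of how many interfaces are active, and every callback runs under the same critical section.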
zbowling commented Jan 1, 2026

Fixes #1027. I sent these same patches upstream to luks, but here they are as well, under BSD3. The same bugs exist in the MT7921 driver and the fixes should likely be applied there too, but I don't have that chipset personally, so I leave it to you folks.
